129 research outputs found
LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, so suggesting alterations to the query in order to guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus.
Comment: Accepted to CVPR 2019
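The core idea of perturbing the query by backpropagation can be pictured as gradient descent on the sketch itself. Below is a minimal, hypothetical PyTorch sketch, not the paper's implementation: `SketchEncoder` is a toy stand-in for the RNN sketch branch, and `target` stands in for a cluster centroid identified as a likely search intent.

```python
# Illustrative sketch (not the authors' code): perturb a vector sketch query
# by backpropagating from a search target through a differentiable encoder.
import torch

class SketchEncoder(torch.nn.Module):
    """Toy stand-in for an RNN-based sketch embedding branch."""
    def __init__(self, in_dim=3, hid=128, emb=64):
        super().__init__()
        self.rnn = torch.nn.GRU(in_dim, hid, batch_first=True)
        self.fc = torch.nn.Linear(hid, emb)

    def forward(self, strokes):            # strokes: (B, T, 3) = (dx, dy, pen)
        _, h = self.rnn(strokes)
        return torch.nn.functional.normalize(self.fc(h[-1]), dim=-1)

encoder = SketchEncoder().eval()
for p in encoder.parameters():             # freeze the network; only the
    p.requires_grad_(False)                # query itself will be updated

strokes = torch.randn(1, 50, 3, requires_grad=True)              # user sketch
target = torch.nn.functional.normalize(torch.randn(1, 64), dim=-1)  # intent centroid

opt = torch.optim.Adam([strokes], lr=0.01)
for _ in range(100):
    opt.zero_grad()
    loss = 1 - torch.nn.functional.cosine_similarity(encoder(strokes), target).mean()
    loss.backward()        # gradients flow back to the stroke sequence itself
    opt.step()             # nudges the query toward the inferred search intent
```

The essential detail is that `requires_grad=True` is set on the stroke tensor, so optimization alters the query rather than the network weights.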
Special section on Non-Photorealistic Animation and Rendering (NPAR) 2010
Robust Synthesis of Adversarial Visual Examples Using a Deep Image Prior
We present a novel method for generating robust adversarial image examples
building upon the recent `deep image prior' (DIP) that exploits convolutional
network architectures to enforce plausible texture in image synthesis.
Adversarial images are commonly generated by perturbing images to introduce
high frequency noise that induces image misclassification, but that is fragile
to subsequent digital manipulation of the image. We show that using DIP to
reconstruct an image under adversarial constraint induces perturbations that
are more robust to affine deformation, whilst remaining visually imperceptible.
Furthermore we show that our DIP approach can also be adapted to produce local
adversarial patches (`adversarial stickers'). We demonstrate robust adversarial
examples over a broad gamut of images and object classes drawn from the
ImageNet dataset.
Comment: Accepted to BMVC 2019
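The reconstruction-under-adversarial-constraint idea can be sketched as a joint loss: a fidelity term tying the prior's output to the original image, plus a classification term steering a classifier toward a chosen label. The following is a rough PyTorch illustration under assumptions of my own (a tiny stand-in DIP generator, a pretrained classifier `clf` supplied by the caller, and a targeted attack shown for concreteness):

```python
# Illustrative sketch (assumptions: pretrained classifier `clf`, a small conv
# generator as the deep image prior; not the paper's exact setup).
import torch
import torch.nn.functional as F

def dip_adversarial(x, clf, target_class, steps=500, lam=0.1):
    """Reconstruct image x through a conv-net prior while steering the
    classifier toward `target_class`; the prior keeps textures plausible."""
    g = torch.nn.Sequential(                       # tiny stand-in DIP network
        torch.nn.Conv2d(32, 64, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(64, 3, 3, padding=1), torch.nn.Sigmoid(),
    )
    z = torch.randn(1, 32, x.shape[-2], x.shape[-1])   # fixed noise input
    opt = torch.optim.Adam(g.parameters(), lr=1e-3)
    y = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        x_hat = g(z)
        # Fidelity to the original image + adversarial classification term.
        loss = F.mse_loss(x_hat, x) + lam * F.cross_entropy(clf(x_hat), y)
        loss.backward()
        opt.step()
    return g(z).detach()       # adversarial image synthesized by the prior
```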
Audio-Visual Contrastive Learning with Temporal Self-Supervision
We propose a self-supervised learning approach for videos that learns
representations of both the RGB frames and the accompanying audio without human
supervision. In contrast to images that capture the static scene appearance,
videos also contain sound and temporal scene dynamics. To leverage the temporal
and aural dimension inherent to videos, our method extends temporal
self-supervision to the audio-visual setting and integrates it with multi-modal
contrastive objectives. As temporal self-supervision, we pose playback speed
and direction recognition in both modalities and propose intra- and inter-modal
temporal ordering tasks. Furthermore, we design a novel contrastive objective
in which the usual pairs are supplemented with additional sample-dependent
positives and negatives sampled from the evolving feature space. In our model,
we apply such losses among video clips and between videos and their temporally
corresponding audio clips. We verify our model design in extensive ablation
experiments and evaluate the video and audio representations in transfer
experiments to action recognition and retrieval on UCF101 and HMDB51, audio
classification on ESC50, and robust video fingerprinting on VGG-Sound, with
state-of-the-art results.
Comment: AAAI-23
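As a simplified illustration of how the contrastive and temporal objectives could be combined, here is a minimal PyTorch sketch of my own (random embeddings replace the actual video/audio encoders, and the playback-speed labels are placeholders), pairing a symmetric InfoNCE loss with a speed-classification head:

```python
# Illustrative sketch, not the authors' code: cross-modal InfoNCE between
# temporally aligned clip embeddings plus a playback-speed head.
import torch
import torch.nn.functional as F

def infonce(v, a, tau=0.07):
    """v, a: (B, D) L2-normalized embeddings; positives are aligned pairs."""
    logits = v @ a.t() / tau                      # (B, B) similarity matrix
    labels = torch.arange(v.size(0))              # matched pairs on diagonal
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

B, D = 8, 128
v = F.normalize(torch.randn(B, D), dim=-1)        # video clip embeddings
a = F.normalize(torch.randn(B, D), dim=-1)        # matching audio embeddings

speed_head = torch.nn.Linear(D, 4)                # classify 4 playback speeds
speed_labels = torch.randint(0, 4, (B,))          # placeholder labels
loss = infonce(v, a) + F.cross_entropy(speed_head(v), speed_labels)
```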
Higher level techniques for the artistic rendering of images and video
EThOS - Electronic Theses Online Service, United Kingdom
PARASOL: Parametric Style Control for Diffusion Image Synthesis
We propose PARASOL, a multi-modal synthesis model that enables disentangled,
parametric control of the visual style of the image by jointly conditioning
synthesis on both content and a fine-grained visual style embedding. We train a
latent diffusion model (LDM) using specific losses for each modality and adapt
the classifier-free guidance for encouraging disentangled control over
independent content and style modalities at inference time. We leverage
auxiliary semantic and style-based search to create training triplets for
supervision of the LDM, ensuring complementarity of content and style cues.
PARASOL shows promise for enabling nuanced control over visual style in
diffusion models for image creation and stylization, as well as generative
search where text-based search results may be adapted to more closely match
user intent by interpolating both content and style descriptors.
Comment: Added Appendix
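One plausible reading of disentangled control via classifier-free guidance is to compose a separate guidance term per modality, each with its own scale. The sketch below is an assumed formulation with hypothetical names (`eps_model`, `w_c`, `w_s`), not PARASOL's published equations:

```python
# Illustrative sketch of dual-condition classifier-free guidance; an assumed
# formulation, not the paper's. `eps_model(x, t, c, s)` predicts noise, and
# passing None stands for the null embedding learned via conditioning dropout.
import torch

def dual_cfg(eps_model, x_t, t, content, style, w_c=5.0, w_s=3.0):
    e_uncond = eps_model(x_t, t, None, None)       # neither condition
    e_content = eps_model(x_t, t, content, None)   # content only
    e_full = eps_model(x_t, t, content, style)     # content + style
    # Scale content and style guidance independently at inference time.
    return (e_uncond
            + w_c * (e_content - e_uncond)
            + w_s * (e_full - e_content))

# Toy usage with a dummy noise predictor standing in for the LDM's UNet:
eps_model = lambda x, t, c, s: torch.zeros_like(x)
eps = dual_cfg(eps_model, torch.randn(2, 4, 32, 32), 10, "content_emb", "style_emb")
```

Separate scales only become meaningful if each condition is dropped independently during training, which is what the classifier-free guidance adaptation described above suggests.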